Student Name : Harshvardhan Tyagi
Student Id : 8856516
Create at least 4 graphs using Matplotlib, Seaborn, or Plotly (or a combination of these). You can use the examples in the galleries that these packages provide (links appear in our presentation about visualizations).
Matplotlib: Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.
Create publication quality plots. Make interactive figures that can zoom, pan, update. Customize visual style and layout. Export to many file formats. Embed in JupyterLab and Graphical User Interfaces. Use a rich array of third-party packages built on Matplotlib.
Import the necessary library, then create a figure and axes for the plot with plt.subplots().
Define the "broken" horizontal bar segments using ax.broken_barh(). The first argument is a list of tuples, one per segment: the first element of each tuple is the starting x position, and the second element is the width of the segment. The second argument is a (ymin, height) tuple that sets the vertical position and height of all segments drawn by that call.
Set the y-axis limits and x-axis limits using ax.set_ylim() and ax.set_xlim(), respectively.
Add labels to the x-axis and modify the y-axis tick labels using ax.set_xlabel() and ax.set_yticks().
Make grid lines visible using ax.grid(True).
Annotate the plot with text using ax.annotate(). This places the text at the xytext position and draws an arrow from the text to the annotated point.
Finally, display the plot using plt.show().
import matplotlib.pyplot as plt
# Horizontal bar plot with gaps
fig, ax = plt.subplots()
ax.broken_barh([(110, 30), (150, 10)], (10, 9), facecolors='tab:blue')
ax.broken_barh([(10, 50), (100, 20), (130, 10)], (20, 9),
               facecolors=('tab:orange', 'tab:green', 'tab:red'))
ax.set_ylim(5, 35)
ax.set_xlim(0, 200)
ax.set_xlabel('seconds since start')
ax.set_yticks([15, 25], labels=['Bill', 'Jim']) # Modify y-axis tick labels
ax.grid(True) # Make grid lines visible
ax.annotate('race interrupted', (61, 25),
            xytext=(0.8, 0.9), textcoords='axes fraction',
            arrowprops=dict(facecolor='black', shrink=0.05),
            fontsize=16,
            horizontalalignment='right', verticalalignment='top')
plt.show()
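The (start, width) convention described above is easy to get wrong when the data arrives as (start, end) intervals instead. The following is a minimal sketch of one way to convert between the two; the helper name `to_segments` is hypothetical and not part of Matplotlib, and the intervals are Jim's three segments from the plot above rewritten as start and end times.

```python
import matplotlib.pyplot as plt


def to_segments(intervals):
    """Convert (start, end) pairs into the (start, width) pairs broken_barh expects.

    Hypothetical helper for illustration; not part of Matplotlib.
    """
    return [(start, end - start) for start, end in intervals]


# Jim's segments from the plot above, written as (start, end) intervals.
jim_intervals = [(10, 60), (100, 120), (130, 140)]
jim_segments = to_segments(jim_intervals)

fig, ax = plt.subplots()
# The second argument is (ymin, height): the bars span y = 20 to y = 29.
ax.broken_barh(jim_segments, (20, 9), facecolors='tab:orange')
ax.set_xlim(0, 200)
ax.set_ylim(5, 35)
ax.set_xlabel('seconds since start')
```

Outside a notebook, call plt.show() afterwards to display the figure.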
Plotly
Plotly is a popular and powerful data visualization library in Python that allows you to create interactive and visually appealing charts, graphs, and plots. It is often used for data exploration, data analysis, and data presentation in various fields, including data science, machine learning, finance, and scientific research.
Tips for working with Plotly fall into four categories:
1. Performance Tips: These tips are aimed at optimizing the performance of your Plotly charts, especially when dealing with large datasets or complex visualizations. They might include suggestions for reducing data size or using aggregation techniques.
2. Interactivity Tips: Plotly is known for its interactive charts, and tips in this category focus on enhancing user engagement. They might suggest adding hover labels, click interactions, or zoom/pan features to your charts.
3. Styling and Customization Tips: Plotly allows for extensive customization of chart appearance. Tips in this category can help you choose color schemes, adjust fonts, and style your charts to align with your project's visual identity.
4. Best Practices: These tips provide guidance on adhering to best practices in data visualization. They might cover topics such as choosing the right chart type for your data, avoiding clutter, and ensuring your charts are easy to understand.
Import the plotly.express library as px.
Load the "tips" dataset using df = px.data.tips(). This dataset contains information about restaurant tips, including the day of the week and the total bill.
Create a box plot using fig = px.box(df, x="day", y="total_bill"). This places the "day" column on the x-axis and the "total_bill" column on the y-axis, so the box plot shows the distribution of total bills for each day in the dataset.
Finally, display the plot using fig.show(), which renders it inline in a notebook or opens it in a browser, depending on your environment.
import plotly
plotly.offline.init_notebook_mode()
import plotly.express as px
# using the tips dataset
df = px.data.tips()
df  # df holds the built-in tips dataset as a pandas DataFrame
| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 239 | 29.03 | 5.92 | Male | No | Sat | Dinner | 3 |
| 240 | 27.18 | 2.00 | Female | Yes | Sat | Dinner | 2 |
| 241 | 22.67 | 2.00 | Male | Yes | Sat | Dinner | 2 |
| 242 | 17.82 | 1.75 | Male | No | Sat | Dinner | 2 |
| 243 | 18.78 | 3.00 | Female | No | Thur | Dinner | 2 |
244 rows × 7 columns
# plotting the box chart
fig = px.box(df, x="day", y="total_bill")
# showing the plot
fig.show()
Seaborn
Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for creating attractive and informative statistical graphics. Seaborn is particularly useful for visualizing complex datasets with concise and aesthetically pleasing plots.
Key features include:
1. Built-in Themes and Color Palettes: Seaborn comes with several built-in themes and color palettes that make it easy to create visually appealing plots.
2. Integration with Pandas: Seaborn works seamlessly with Pandas DataFrames, making it convenient to work with data loaded from various sources.
3. Statistical Estimations: Seaborn simplifies the process of creating plots that involve statistical estimations, such as bar plots with error bars or regression plots.
Import the necessary packages, Seaborn and Matplotlib.
Load the Iris dataset using sns.load_dataset("iris").
Create a line plot using sns.lineplot(x="sepal_length", y="sepal_width", data=data). Because several flowers share the same sepal length, lineplot draws the mean sepal width at each sepal length, with a shaded band for the 95% confidence interval by default.
Set the x-axis limit using plt.xlim(5), which sets the left edge of the x-axis to 5 and leaves the right edge unchanged.
Finally, display the plot using plt.show().
This code generates a line plot of sepal width against sepal length in the Iris dataset, with the x-axis starting from 5.
# importing packages
import seaborn as sns
import matplotlib.pyplot as plt
# loading dataset
data = sns.load_dataset("iris")
data  # data holds the built-in Iris dataset returned by load_dataset
| | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| ... | ... | ... | ... | ... | ... |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 | virginica |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 | virginica |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 | virginica |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 | virginica |
| 149 | 5.9 | 3.0 | 5.1 | 1.8 | virginica |
150 rows × 5 columns
# draw lineplot
sns.lineplot(x="sepal_length", y="sepal_width", data=data)
# setting the x limit of the plot
plt.xlim(5)
plt.show()
We created a bar chart using Matplotlib to visualize the supply of different fruits by kind and color. The chart displays the fruit types on the x-axis and their counts on the y-axis, and colors each bar by fruit color. A y-axis label, a title, and a legend make the chart more informative. The label '_red' starts with an underscore, so Matplotlib leaves it out of the legend and "red" appears only once.
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
fruits = ['apple', 'blueberry', 'cherry', 'orange']
counts = [40, 100, 30, 55]
bar_labels = ['red', 'blue', '_red', 'orange']
bar_colors = ['tab:red', 'tab:blue', 'tab:red', 'tab:orange']
ax.bar(fruits, counts, label=bar_labels, color=bar_colors)
ax.set_ylabel('fruit supply')
ax.set_title('Fruit supply by kind and color')
ax.legend(title='Fruit color')
plt.show()
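One optional refinement is to print each count above its bar so readers do not have to estimate values from the axis. The sketch below repeats the chart above and adds ax.bar_label(); the padding value and the y-axis limit are arbitrary choices made here to leave room for the tallest label.

```python
import matplotlib.pyplot as plt

fruits = ['apple', 'blueberry', 'cherry', 'orange']
counts = [40, 100, 30, 55]
bar_labels = ['red', 'blue', '_red', 'orange']  # leading underscore hides the entry from the legend
bar_colors = ['tab:red', 'tab:blue', 'tab:red', 'tab:orange']

fig, ax = plt.subplots()
bars = ax.bar(fruits, counts, label=bar_labels, color=bar_colors)

# Write each bar's height just above it
value_labels = ax.bar_label(bars, padding=3)
ax.set_ylim(0, 115)  # headroom so the label on the tallest bar is not clipped

ax.set_ylabel('fruit supply')
ax.set_title('Fruit supply by kind and color')
legend = ax.legend(title='Fruit color')
legend_texts = [t.get_text() for t in legend.get_texts()]
```

The legend then lists red, blue, and orange once each, and every bar carries its count.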